Ford GoBike Communication¶

In this project I performed an exploratory analysis of data provided by Ford GoBike, a bike-share system provider, using Python visualization techniques. The goal is to identify which variables have the most influence on usage of the bike-sharing service. The analysis covers data from 2019.

Dataset structure¶

What is the structure of your dataset?¶

Columns:

  • duration_mins
  • start_day
  • end_day
  • start_month
  • end_month
  • start_hour
  • end_hour
  • distance_miles
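
The columns above are derived features rather than raw fields. As a minimal sketch of how they could be computed from one trip record: the raw field names (`start_time`, `end_time`, `duration_sec`, `start_lat`, `start_lon`, `end_lat`, `end_lon`), the timestamp format and the use of the haversine formula for distance are all assumptions for illustration, not details confirmed by this analysis.

```python
import math
from datetime import datetime

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def derive_features(trip):
    """Build the derived columns from one raw trip record (a dict)."""
    fmt = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format
    start = datetime.strptime(trip["start_time"], fmt)
    end = datetime.strptime(trip["end_time"], fmt)
    return {
        "duration_mins": trip["duration_sec"] / 60,
        "start_day": start.strftime("%A"),
        "end_day": end.strftime("%A"),
        "start_month": start.strftime("%B"),
        "end_month": end.strftime("%B"),
        "start_hour": start.hour,
        "end_hour": end.hour,
        "distance_miles": haversine_miles(
            trip["start_lat"], trip["start_lon"], trip["end_lat"], trip["end_lon"]
        ),
    }


# Hypothetical record, made up for illustration
example_trip = {
    "start_time": "2019-03-04 08:05:00",
    "end_time": "2019-03-04 08:15:00",
    "duration_sec": 600,
    "start_lat": 37.7766, "start_lon": -122.3947,
    "end_lat": 37.7896, "end_lon": -122.4008,
}
features = derive_features(example_trip)
```

Note that this measures the straight-line distance between the start and end points, so a round trip that returns to its starting station would register as 0 miles.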

What is/are the main feature(s) of interest in your dataset?¶

  • Understand the number of subscribers and customers in the dataset and which of them are more valuable to the company.

  • The average rental duration and the distance in miles travelled by different classes of users.

  • Understand whether the start time of rentals differs between subscribers and customers.

  • The locations that see the most rentals among different classes of users.

What features in the dataset do you think will help support your investigation into your feature(s) of interest?¶

  • Distance in Miles
  • Duration in Mins
  • Start Month, Day and Hour
  • Latitude and Longitude
  • Bike share for all trip data
  • User Type
  • Start Station Names

Univariate Exploration¶

Distribution of Rental duration in minutes¶

  • On the linear scale, the distribution of rental duration is right-skewed, with some rentals lasting about an hour.

  • On the log scale, rental duration is roughly bimodal.
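
A right-skewed distribution is usually easier to read with bins that are evenly spaced on a log scale. The sketch below shows one way to build such bins and count values in them using plain Python. The function names, the choice of five bins per decade and the sample durations are assumptions made for illustration, not the notebook's actual plotting code.

```python
import math


def log_bin_edges(lo, hi, bins_per_decade=5):
    """Return bin edges evenly spaced in log10 that cover [lo, hi] (lo > 0)."""
    start = math.floor(math.log10(lo) * bins_per_decade)
    stop = math.ceil(math.log10(hi) * bins_per_decade)
    return [10 ** (k / bins_per_decade) for k in range(start, stop + 1)]


def histogram(values, edges):
    """Count values per bin [edges[i], edges[i+1]); the last bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    last = len(counts) - 1
    for v in values:
        for i in range(len(counts)):
            upper_ok = v < edges[i + 1] or (i == last and v == edges[-1])
            if edges[i] <= v and upper_ok:
                counts[i] += 1
                break
    return counts


# Made-up rental durations in minutes
durations = [1, 2, 5, 8, 9, 12, 30, 60, 500]
edges = log_bin_edges(min(durations), max(durations))
counts = histogram(durations, edges)
```

Each bin is wider than the previous one by the same factor, so the long tail of very long rentals is compressed and structure among short rentals, such as a second mode, becomes visible.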

Distribution of Customer vs Subscribers¶

  • The number of subscribers is larger than number of customers. 82.2% of users are Subscribers and 17.8% are Customers.

Day, Month and Hour vs Rentals¶

  • The peak rental hours are 7-9 am and 4-6 pm.
  • The fewest rentals occur on Saturday and Sunday.
  • Rentals are highest from August through October, and in March and April.

How many users are enrolled in the Bike Share for All scheme?¶

  • 92.2% of subscribers are not eligible for the Bike Share for All scheme and only 7.8% are eligible.
  • No customers are in the Bike Share for All scheme.

Distance travelled by Subscribers and Customers¶

  • Subscribers account for 76.7% of the total distance travelled and customers for 23.3%.

Percentage of Rental duration for Subscribers and Customers¶

  • Subscribers account for 68.3% of the total rental time, a smaller share than their 76.7% of the total distance.
  • Subscribers rent for shorter durations but travel more miles than Customers.
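
The percentages above are each group's share of a total: rentals, distance or time. A minimal sketch of that calculation in plain Python follows. The helper name `share_by_group`, the record layout and the toy values are assumptions for illustration only.

```python
from collections import defaultdict


def share_by_group(rows, group_key, value_key=None):
    """Percentage of the total contributed by each group.

    With value_key=None every row counts once (share of rentals);
    otherwise the values stored under value_key are summed.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += 1 if value_key is None else row[value_key]
    grand_total = sum(totals.values())
    return {group: 100 * total / grand_total for group, total in totals.items()}


# Toy data, not taken from the real dataset
trips = [
    {"user_type": "Subscriber", "duration_mins": 10, "distance_miles": 1.0},
    {"user_type": "Subscriber", "duration_mins": 10, "distance_miles": 1.0},
    {"user_type": "Subscriber", "duration_mins": 10, "distance_miles": 1.0},
    {"user_type": "Customer", "duration_mins": 30, "distance_miles": 1.0},
]
rental_share = share_by_group(trips, "user_type")
distance_share = share_by_group(trips, "user_type", "distance_miles")
duration_share = share_by_group(trips, "user_type", "duration_mins")
```

In this toy data the Subscriber share of total time is lower than the Subscriber share of total distance, which is the same pattern the analysis reports for the real data.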

Bivariate Exploration¶

Distribution of rental duration in minutes (subscribers)¶

  • On the linear scale, most subscribers rent bikes for 4-11 minutes.
  • On the log scale, the distribution is bimodal.

Distribution of rental duration in minutes (customers)¶

  • On the linear scale, most customers rent bikes for 7-13 minutes.
  • On the log scale, the distribution is bimodal.

Distribution of rental duration in minutes (all users)¶

  • On the linear scale, most users rent bikes for 4-8 minutes.
  • On the log scale, the distribution is bimodal.

Distance travelled in miles by Subscribers¶

  • Most Subscribers travel 0.4-0.6 or 0.8-0.9 miles, which suggests they use the bikes for transportation.
  • A large number of Customers travel less than 0.1 miles, which suggests they use the bikes for recreation.
  • Bike Share for All users ride the bikes for day-to-day travel.

Duration in minutes of Rentals for Subscribers and Customers¶

  • Customers rent bikes for longer durations than Subscribers.
  • The violin plots show that Customer rental durations are widely spread, with most rentals lasting about 7.5 to 12.5 minutes.
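
A violin plot shows the shape of a distribution; the quartiles per user type give a numeric counterpart to that picture. The following is a minimal sketch using the standard library's `statistics.quantiles`. The function name `duration_quartiles` and the sample durations are made up for illustration.

```python
import statistics
from collections import defaultdict


def duration_quartiles(rows):
    """Return q1, median and q3 of rental duration for each user type.

    rows is an iterable of (user_type, duration_in_minutes) pairs.
    """
    by_type = defaultdict(list)
    for user_type, minutes in rows:
        by_type[user_type].append(minutes)
    summary = {}
    for user_type, values in by_type.items():
        q1, median, q3 = statistics.quantiles(values, n=4)
        summary[user_type] = {"q1": q1, "median": median, "q3": q3}
    return summary


# Made-up durations in minutes
rows = (
    [("Subscriber", m) for m in [5, 6, 8, 9, 11]]
    + [("Customer", m) for m in [7, 9, 12, 13, 40]]
)
summary = duration_quartiles(rows)
```

A wider gap between `q1` and `q3` for one group corresponds to a wider violin body for that group.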

Stations with the most customers rentals¶

  • This data shows where to concentrate resources.
  • The top three stations are well-known tourist locations, which explains their popularity.

Stations with the most subscribers rentals¶

  • The top locations differ from the Customers' top locations because Subscribers use the bikes for transportation; for example, the first two are train stations.

Stations with the most bike share rentals¶

  • The top ten locations match neither the Customers' nor the Subscribers' top locations.
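
The station rankings in the last three sections can be compared directly by ranking start stations per user type and intersecting the results. Below is a minimal sketch of that comparison; the function name `top_stations` and the station names are hypothetical placeholders, not stations from the dataset.

```python
from collections import Counter, defaultdict


def top_stations(trips, n=10):
    """Map each user type to its n most frequent start stations, most popular first.

    trips is an iterable of (user_type, start_station_name) pairs.
    """
    counts = defaultdict(Counter)
    for user_type, station in trips:
        counts[user_type][station] += 1
    return {
        user_type: [station for station, _ in counter.most_common(n)]
        for user_type, counter in counts.items()
    }


# Placeholder station names, made up for illustration
trips = [
    ("Subscriber", "Station A"), ("Subscriber", "Station A"), ("Subscriber", "Station A"),
    ("Subscriber", "Station B"), ("Subscriber", "Station B"), ("Subscriber", "Station C"),
    ("Customer", "Station D"), ("Customer", "Station D"), ("Customer", "Station D"),
    ("Customer", "Station B"), ("Customer", "Station B"), ("Customer", "Station E"),
]
top = top_stations(trips, n=2)
shared = set(top["Subscriber"]) & set(top["Customer"])
```

An empty or small `shared` set indicates that the two groups concentrate on different stations.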

Multivariate Exploration¶

Distribution of Start Day and Start Hour for Customers¶

  • On weekdays, rentals start later in the day.
  • On weekends, demand peaks from the afternoon until the evening.
  • This suggests that Customers could be tourists.

Heat map of Time vs Subscribers Rentals.¶

  • Most rentals occur during 7-9 am and 4-6 pm.
  • There are fewer rentals on weekends (Saturday and Sunday).
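
The data behind a day-by-hour heat map is a 7x24 grid of rental counts. A minimal sketch of building that grid from start timestamps is shown here. The function name `rentals_by_day_and_hour` and the sample timestamps are assumptions for illustration, and the plotting step is left out.

```python
from datetime import datetime


def rentals_by_day_and_hour(start_times):
    """Return a 7x24 grid of counts: grid[weekday][hour], with Monday as row 0."""
    grid = [[0] * 24 for _ in range(7)]
    for ts in start_times:
        grid[ts.weekday()][ts.hour] += 1
    return grid


# Made-up start times
start_times = [
    datetime(2019, 3, 4, 8, 15),   # Monday morning
    datetime(2019, 3, 4, 8, 40),   # Monday morning
    datetime(2019, 3, 4, 17, 5),   # Monday evening
    datetime(2019, 3, 9, 14, 0),   # Saturday afternoon
]
grid = rentals_by_day_and_hour(start_times)
```

Building one grid per user type makes the weekday commute peaks and the weekend afternoon peaks directly comparable.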

Locations¶

Top 40 locations of rentals for Customers¶

  • Most of the top 40 stations are in San Francisco.

Top 40 locations of rentals for Subscribers¶

  • Most of the top 40 stations are in San Francisco, and they are less scattered than the Customers' rental locations.

Top 40 locations of rentals for Bike Share for all users¶

  • There is no single concentrated area; the stations are spread out.

Talk about some of the relationships you observed in this part of the investigation. Were there features that strengthened each other in terms of looking at your feature(s) of interest?¶

  • There is a relationship between time of rental and user type.
  • There is a relationship between distance travelled and rental duration for Subscribers.
  • There is a relationship between user type and start station from where bikes are rented.

Were there any interesting or surprising interactions between features?¶

  • Subscribers rented the bikes for less time than Customers but travelled farther, which suggests they are in a hurry.

  • The relationship between start day, start hour and user type hints at the nature of each user type.

  • Subscribers seem to be students and employees, while Customers seem to be tourists.

Summary¶

  • Subscribers use bikes as a transportation option.

  • Subscribers are regular users who ride to and from work or school, renting a bike at 7-9 am and 4-6 pm on weekdays.

  • Customers are likely tourists who rent bikes to explore the Bay Area, mainly on weekends.